A hitchhiker's guide to CUDA programming
GPU Kernels
Challenging the Fastest OSS Workflow Engine
PTX
Opportunistically Parallel Lambda Calculus
LSP
The next RISC-V processor frontier: AI
edn.com · 1d
CPU Architecture
Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
Flash Attention
Async/Await is finally back in Zig
CUDA Events
Machine-learning predictive autoscaling for Flink
engineering.grab.com · 2d
CUDA Events
Rubin, Vera and the 1800-watt question: Nvidia shows off its future and prepares for the next AI storm
igorslab.de · 1d
CUDA Events
A Hybrid Reconstruction Framework for Efficient High-Order Shock-Capturing on Unstructured Meshes
arxiv.org · 1d
CUTLASS
Show HN: GPU-accelerated sandboxes for running AI coding agents in parallel [video]
NCCL
How fast can an LLM go?
TensorRT
Inference Acceleration from the Ground Up
semiwiki.com · 3d
Tensor Cores
I tested Arc Raiders across four GPUs of different ages - optimization still exists
xda-developers.com · 2h
PTX
A portable picokernel for async I/O
Profiling Tools